A decentralized peer-to-peer storage architecture that distributes encrypted, erasure-coded data across heterogeneous storage nodes, designed for privacy, resilience, and performance.
PeerStor is a peer-to-peer (P2P) storage system that addresses the critical limitations of centralized cloud storage providers. It provides end-to-end encryption, transparent deduplication, automatic fault tolerance, and high throughput, while the storage infrastructure itself learns nothing about your data.
- End-to-End Encryption: AES-256-GCM authenticated encryption ensures that data is secure both in transit and at rest
- Content-Defined Chunking (CDC): Using Rabin fingerprinting for intelligent file fragmentation and automatic deduplication
- Reed-Solomon Erasure Coding: Survive up to 6 concurrent node failures with only 60% storage overhead
- Kademlia DHT: Distributed hash table for O(log N) peer discovery and lookups
- Multi-Protocol Support: HTTP/1.1, WebDAV, SFTP, and FTP/FTPS
- Zero-Knowledge Architecture: The storage infrastructure knows nothing about your data
- High Performance: 175 MiB/s upload and 190 MiB/s download throughput on commodity hardware
- Graceful Degradation: Maintains performance even with 30% node failures
PeerStor implements a five-layer modular architecture for independent scalability:
```
┌───────────────────────────────────────────────┐
│ Protocol Layer (HTTP/1.1, WebDAV, FTP/FTPS)   │
├───────────────────────────────────────────────┤
│ Fragmentation Layer (CDC + AES-256-GCM)       │
├───────────────────────────────────────────────┤
│ DHT Layer (Kademlia, XOR Routing)             │
├───────────────────────────────────────────────┤
│ Storage Layer (SQLite Index, LRU Cache)       │
├───────────────────────────────────────────────┤
│ Network Layer (NAT Traversal, TLS 1.3)        │
└───────────────────────────────────────────────┘
```
Protocol Layer: Provides standard interfaces for clients (web browsers, curl, file managers) through HTTP/1.1, WebDAV, and FTP/FTPS protocols with resumable uploads and chunk-level acknowledgments.
Fragmentation Layer: Implements Content-Defined Chunking using Rabin Fingerprints, encrypts each chunk with AES-256-GCM, computes SHA-256 content hashes, and applies Reed-Solomon Erasure Encoding.
DHT Layer: Implements the Kademlia Protocol with 160-bit Node IDs, maintains k-buckets for peers, and enables iterative parallel lookups with O(log N) latency.
Storage Layer: Maintains chunks in persistent content-addressable storage indexed by SQLite, utilizes LRU caching for frequently accessed chunks, and enforces storage quotas.
Network Layer: Manages TCP and UDP multiplexing, STUN-based NAT traversal, TLS 1.3 encryption, and adaptive congestion control.
PeerStor uses Rabin fingerprinting for intelligent file fragmentation instead of fixed-size chunks, enabling superior deduplication:
- Time Complexity: O(S) where S is the file size
- Space Complexity: O(1) constant space beyond the window size
- Deduplication Ratio: Up to 90% bandwidth savings on incremental updates
- Expected Chunk Size: 4 MiB (configurable min/max bounds)
Reed-Solomon coding divides each chunk into k data fragments and produces n−k parity fragments; the original data can be reconstructed from any k of the n fragments, tolerating up to n−k simultaneous losses:
- Default Configuration: k=10, n=16 (60% storage overhead, tolerates 6 failures)
- Encoding Time: O(n·k·L) where L is chunk size in bytes (~500 MiB/s throughput)
- Decoding Time: O(k·L) for typical file sizes
- Durability: 6.9 nines with default configuration
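The overhead and fault-tolerance figures above follow directly from the (n, k) parameters. A quick sanity check, where the per-fragment loss probability is an assumed input for illustration rather than a figure from PeerStor's evaluation:

```python
# Back-of-envelope arithmetic for the default (n, k) = (16, 10) layout.
from math import comb

n, k = 16, 10
overhead = n / k - 1   # parity overhead: 0.6, i.e. "60% storage overhead"
tolerated = n - k      # any 6 simultaneous fragment losses are recoverable

def loss_probability(p: float) -> float:
    """P(unrecoverable) = P(more than n-k of the n fragments fail),
    assuming independent fragment failures with probability p each."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))
```

With independent failures, durability improves combinatorially rather than linearly in the parity count, which is why a modest 60% overhead yields many nines.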
Distributed hash table for peer discovery and routing with logarithmic lookup complexity:
- Lookup Latency: O(log₂ N) where N is network size
- Message Complexity: O(α·log N) per lookup (α=3 parallel queries)
- Routing Table: O(k·log N) entries per node (k=20 peers per bucket)
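The complexity bounds above all rest on Kademlia's XOR metric. A sketch of that metric over 160-bit IDs; the helper names are illustrative, and PeerStor's wire protocol and bucket maintenance are not shown:

```python
# Kademlia's XOR distance metric over 160-bit IDs (illustrative helpers).
import hashlib

def node_id(name: str) -> int:
    """Derive a 160-bit ID (here from SHA-1 of an arbitrary label)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance: symmetric, and d(a, a) == 0."""
    return a ^ b

def bucket_index(self_id: int, other: int) -> int:
    """k-bucket index for a peer: position of the highest differing bit (0..159)."""
    return xor_distance(self_id, other).bit_length() - 1

def closest(peers: list[int], target: int, alpha: int = 3) -> list[int]:
    """The alpha peers closest to target -- the fan-out of one lookup round."""
    return sorted(peers, key=lambda p: xor_distance(p, target))[:alpha]
```

Each lookup round queries the α closest known peers and learns peers at least one bit closer to the target, which halves the remaining ID space and yields the O(log N) hop count.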
Authenticated encryption providing both confidentiality and integrity:
- Key Derivation: PBKDF2-SHA256 with 100,000+ iterations
- Encryption Throughput: 3-5 GiB/s on modern CPUs with AES-NI
- Authentication Tag: 128-bit GCM authentication tags prevent tampering
- Nonce: A unique nonce per chunk ensures no (key, nonce) pair is ever reused, which would compromise GCM's confidentiality and integrity guarantees
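Client-side key derivation with the parameters listed above (PBKDF2-HMAC-SHA256, 100,000+ iterations) can be done with the standard library alone; the AES-256-GCM encryption step itself would use a crypto library (e.g. the `cryptography` package's AESGCM) with a fresh 96-bit nonce for every chunk:

```python
# PBKDF2-HMAC-SHA256 key derivation using only the standard library.
# The passphrase below is a placeholder; the salt is random and would be
# stored alongside the user's metadata.
import hashlib, os

ITERATIONS = 100_000

def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 256-bit key suitable for AES-256-GCM."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt,
                               ITERATIONS, dklen=32)

salt = os.urandom(16)
key = derive_key("example passphrase", salt)
nonce = os.urandom(12)  # unique per chunk: never reuse a (key, nonce) pair
```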
- Upload: 175 MiB/s
- Download: 190 MiB/s
| Condition | Upload | Download | Degradation |
|---|---|---|---|
| Gigabit, 0ms | 175 MiB/s | 190 MiB/s | --- |
| 100 Mbps throttle | 85 MiB/s | 92 MiB/s | -51% |
| 50ms latency | 168 MiB/s | 182 MiB/s | -5% |
| 100ms latency | 160 MiB/s | 175 MiB/s | -10% |
| 1% packet loss | 170 MiB/s | 185 MiB/s | -3% |
| 5% packet loss | 155 MiB/s | 170 MiB/s | -12% |
| Peers (N) | Lookup (ms) | log₂ N | Ratio (ms / log₂ N) |
|---|---|---|---|
| 100 | 5 | 6.6 | 0.76 |
| 1,000 | 8 | 10.0 | 0.80 |
| 10,000 | 12 | 13.3 | 0.90 |
| 100,000 | 15 | 16.6 | 0.90 |
| 1,000,000 | 18 | 20.0 | 0.90 |
Download throughput degrades linearly with peer failures:
- 10% failures: 171 MiB/s (90% retained)
- 20% failures: 152 MiB/s (80% retained)
- 30% failures: 133 MiB/s (70% retained)
- 50% failures: 95 MiB/s (50% retained)
| Feature | PeerStor | IPFS | Storj | Tahoe-LAFS |
|---|---|---|---|---|
| End-to-End Encryption | ✓ | ✗ | ✓ | ✓ |
| CDC Deduplication | ✓ | ✓ | ✗ | ✗ |
| Erasure Coding | ✓ | ✗ | ✓ | ✓ |
| Zero-Knowledge | ✓ | ✗ | Partial | ✓ |
| Multi-Protocol | ✓ | ✗ | ✗ | ✗ |
| Throughput (MiB/s) | 190 | 50-100 | 100-200 | 50-100 |
| Blockchain Required | No | Optional | Yes | No |
PeerStor implements a zero-knowledge architecture where:
- The storage service knows only the encrypted blob size
- All encryption keys remain exclusively with the client
- Server-side deduplication uses content hashes of encrypted data
- Parity fragments reveal no plaintext information
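How deduplication can coexist with zero knowledge: the server indexes chunks solely by the SHA-256 of the encrypted bytes, so identical ciphertexts collapse to one stored copy while plaintext stays hidden. A hypothetical in-memory sketch:

```python
# Content-addressable store keyed by the hash of *encrypted* chunks.
# Names and structure are hypothetical, for illustration only.
import hashlib

store: dict[str, bytes] = {}   # chunk id -> ciphertext

def put(encrypted_chunk: bytes) -> str:
    """Store a ciphertext chunk under its content hash; duplicates are free."""
    cid = hashlib.sha256(encrypted_chunk).hexdigest()
    store.setdefault(cid, encrypted_chunk)
    return cid
```

Note that since each chunk carries a unique nonce, this deduplicates identical ciphertexts (e.g. re-uploads of an already-encrypted chunk), not independently encrypted copies of the same plaintext.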
The system tolerates an adversary who has compromised up to f < k storage peers; such an adversary can:
- Read encrypted data (confidential due to AES-256-GCM)
- Observe traffic patterns
- Collude with other compromised peers
- Confidentiality: AES-256-GCM encryption ensures no plaintext exposure
- Integrity: GCM authentication tags prevent modification attacks
- Collision Resistance: SHA-256 provides computational collision resistance
- Erasure Resilience: Reed-Solomon codes ensure data survivability
Install from PyPI:

```
pip install peerstor
```

Or install from source:

```
git clone https://github.com/yourusername/peerstor.git
cd peerstor
pip install -e .
```

To run PeerStor as a systemd service:

```
# Copy the service file
sudo cp contrib/systemd/peerstor@.service /etc/systemd/system/

# Enable and start the service
sudo systemctl enable peerstor@default.service
sudo systemctl start peerstor@default.service
```

Start a standalone server:

```
peerstor --listen 0.0.0.0:8080 -v /home/user/storage::rw
```

```
# Mount via WebDAV (Linux)
mount -t davfs http://localhost:8080/d /mnt/peerstor

# Or access directly via web browser
# http://localhost:8080
```

```
# Upload a file
curl -T myfile.txt http://localhost:8080/d/

# Download a file
curl http://localhost:8080/d/myfile.txt > myfile.txt

# List files
curl http://localhost:8080/d/
```

Create a config file at ~/.config/peerstor/peerstor.conf:
```
# Server settings
--listen 0.0.0.0:8080

# Storage volumes
-v /home/user/storage::rw
-v /mnt/backup::ro

# DHT settings
--dht-port 9999

# Performance tuning
--max-upload-chunk 50m
--max-concurrent-uploads 16

# Logging
--log-level info
--logfile /var/log/peerstor.log
```
For comprehensive configuration options:

```
peerstor --help
```

The upload process follows this pipeline:
- Upload Request: Client initiates file upload
- CDC Chunking: File split using Rabin fingerprinting
- AES-256 Encryption: Each chunk encrypted independently
- Reed-Solomon: Erasure codes generated
- DHT Distribution: Fragments distributed to storage peers
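The five stages above can be sketched as a data-flow composition. Every function here is a hypothetical stub (fixed-size split for CDC, XOR for encryption, plain striping for erasure coding): it shows the pipeline's shape, not PeerStor's actual implementation.

```python
# Pipeline sketch built from stubs; names and logic are illustrative.
import hashlib

def cdc_chunk(data: bytes) -> list[bytes]:
    # Stage 2 stub: fixed-size split standing in for Rabin-based CDC.
    return [data[i:i + 4096] for i in range(0, len(data), 4096)]

def encrypt(chunk: bytes, key: bytes) -> bytes:
    # Stage 3 stub: a real client uses AES-256-GCM with a unique nonce.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(chunk))

def erasure_encode(chunk: bytes, k: int = 10) -> list[bytes]:
    # Stage 4 stub: stripe into k data fragments; Reed-Solomon would
    # additionally emit n - k parity fragments.
    size = -(-len(chunk) // k)            # ceiling division
    padded = chunk.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]

def distribute(fragments: list[bytes]) -> dict[str, bytes]:
    # Stage 5 stub: key each fragment by hash instead of placing it on
    # the DHT peers closest to that hash.
    return {hashlib.sha256(f).hexdigest(): f for f in fragments}

def upload(data: bytes, key: bytes) -> dict[str, bytes]:
    # Stage 1 is the client call itself.
    placed = {}
    for chunk in cdc_chunk(data):
        placed.update(distribute(erasure_encode(encrypt(chunk, key))))
    return placed
```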
- Adjust `--read-ahead` for better throughput on high-latency networks
- Configure `--timeout` based on your network conditions
- Enable `--enable-compression` for limited-bandwidth environments
- Use SSD for index databases (SQLite)
- Configure the LRU cache size with `--cache-size`
- Monitor disk I/O with system tools
- Adjust chunk size with the `-c` flag (larger chunks mean fewer chunks and less overhead)
- Enable parallel encoding with `--cpu-threads`
- Use `--cpu-affinity` to pin threads to specific cores
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
For security vulnerabilities, please see SECURITY.md.
PeerStor is released under the MIT License. See LICENSE for details.
We are committed to providing a welcoming and inclusive environment. See CODE_OF_CONDUCT.md.
This project is based on the IEEE conference paper:
"PeerStor: A Decentralized Cloud Storage Architecture with Content-Defined Chunking and Reed-Solomon Erasure Coding"
Authors: Dr. Vivek Parashar, Divyansh Joshi, Priyanshu Priyam
Institution: VIT Bhopal University, Bhopal, India
The paper provides comprehensive algorithmic complexity analysis, formal security proofs, and extensive empirical evaluation on commodity hardware.
- 175 MiB/s upload throughput on commodity hardware
- 190 MiB/s download throughput
- 133 MiB/s download throughput under 30% node failures
- 6.9 nines durability with Reed-Solomon (10,16) configuration
- O(log N) peer discovery with Kademlia DHT
- 90% bandwidth savings on incremental updates via CDC
PeerStor builds upon decades of research in distributed systems, cryptography, and erasure coding. We acknowledge the contributions of:
- Rabin fingerprinting for content-defined chunking
- Reed-Solomon codes for erasure resilience
- Kademlia protocol for distributed hash tables
- AES-NI for cryptographic acceleration
- The open-source community for fundamental libraries